OBJECT TRACKING BASED ON COLOR DISTRIBUTION 



BACKGROUND OF THE INVENTION 

1. Field of the Invention 

This invention relates to the field of image processing, and in particular to the tracking of 
target objects in images based on the distribution of color, specifically the hue and saturation 
of color pixels and the intensity of gray pixels. 

2. Description of Related Art 

Motion-based tracking is commonly used to track particular objects within a series of 
image frames. For example, security systems can be configured to process images from one or 
more cameras, to autonomously detect potential intruders into secured areas, and to provide 
appropriate alarm notifications based on the intruder's path of movement. Similarly, 
videoconferencing systems can be configured to automatically track a selected speaker, or a 
home automation system can be configured to track occupants and to correspondingly control 
lights and appliances in dependence upon each occupant's location. 

A variety of motion-based tracking techniques are available, based on the recognition of 
the same object in a series of images from a camera. Characteristics such as object size, shape, 
color, etc. can be used to distinguish objects of potential interest, and pattern matching 
techniques can be applied to track the motion of the same object from frame to frame in the 
series of images from the camera. In the field of image tracking, a 'target' is modeled by a set of 
image characteristics, and each image frame, or subset of the image frame, is searched for a 
similar set of characteristics. 

Precise and robust target modeling, however, generally requires high resolution, and the 
comparison process can be computationally complex. This computational complexity often 
limits target tracking to very high-speed computers, or to off-line (i.e. non-real-time) processing. 
In like manner, the high-resolution characterization generally requires substantial memory 
resources for containing the detailed data of each target and each image frame. 



702055A PATENT APPLICATION 



1 



6 May 2001 



BRIEF SUMMARY OF THE INVENTION 
It is an object of this invention to provide a target tracking system and method that is 
computationally efficient while also being relatively accurate. It is a further object of this 
invention to provide a target modeling system and method that uses a relatively small amount of 
memory and/or processing resources. 

These objects and others are achieved by providing a color modeling and color matching 
process and system that uses the hue and saturation of color pixels, in conjunction with the 
intensity of gray or near-gray pixels, to characterize targets and images. A target is characterized 
by a histogram of hues and saturation within the target image, with a greater distinction being 
provided to the hues. Recognizing that the hue of gray, or near-gray, picture elements (pixels) is 
highly sensitive to noise, the gray or near-gray pixels are encoded as a histogram of intensity, 
rather than hue or saturation. The target tracking system searches for the occurrence of a similar 
set of coincident color-hue-saturation and gray-intensity histograms within each of the image 
frames of a series of image frames. To further simplify the computation and storage tasks, targets 
are defined in terms of a rectangular segment of an image frame. Recursive techniques are 
employed to reduce the computation complexity of the color-matching task. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention is explained in further detail, and by way of example, with reference to the 
accompanying drawings wherein: 

FIG. 1 illustrates an example flow diagram of an image tracking system in accordance with this 
invention. 

FIG. 2 illustrates an example block diagram of an image tracking system in accordance with this 
invention. 

FIG. 3 illustrates an example flow diagram for creating a composite histogram of color hue and 
saturation, and gray intensity characteristics in accordance with this invention. 

Throughout the drawings, the same reference numerals indicate similar or corresponding 
features or functions. 

DETAILED DESCRIPTION OF THE INVENTION 
FIG. 1 illustrates an example flow diagram of an image tracking system 100 in 
accordance with this invention. Video input, in the form of image frames, is continually received, 
at 110, and continually processed, via the image processing loop 140-180. At some point, either 
automatically or based on manual input, a target is selected for tracking within the image frames, 
at 120. After the target is identified, it is modeled for efficient processing, at 130. At block 140, 
the current image is aligned to a prior image, taking into account any camera adjustments that 
may have been made, at block 180. After aligning the prior and current images in the image frames, 
the motion of objects within the frame is determined, at 150. Generally, a target that is being 
tracked is a moving target, and the identification of independently moving objects improves the 
efficiency of locating the target, by ignoring background detail. At 160, color matching is used 
to identify the portion of the image, or the portion of the moving objects in the image, 
corresponding to the target. Based on the color matching and/or other criteria, such as size, 
shape, speed of movement, etc., the target is identified in the image, at 170. 

In an integrated security system, the tracking of a target generally includes controlling 
one or more cameras to facilitate the tracking, at 180. In a multi-camera system, the target 
tracking system 100 determines when to "hand-off" the tracking from one camera to another, for 
example, when the target travels from one camera's field of view to another. In either a single or 
multi-camera system, the target tracking system 100 may also be configured to adjust the 
camera's field of view, via control of the camera's pan, tilt, and zoom controls, if any. 
Alternatively, or additionally, the target tracking system 100 may be configured to notify a 
security person of the movements of the target, for a manual control of the camera, or selection 
of cameras. 

As would be evident to one of ordinary skill in the art, a particular tracking system may 
contain fewer or more functional blocks than those illustrated in the example system 100 of FIG. 
1. Although not illustrated, the target tracking system 100 may be configured to effect other 
operations as well. For example, in a security application, the tracking system 100 may be 
configured to activate audible alarms if the target enters a secured zone, or to send an alert to a 
remote security force, and so on. In a home-automation application, the tracking system 100 may 
be configured to turn appliances and lights on or off in dependence upon an occupant's path of 
motion, and so on. 

The tracking system is preferably embodied as a combination of hardware devices and 
one or more programmed processors. FIG. 2 illustrates an example block diagram of an image 
tracking system 200 in accordance with this invention. One or more cameras 210 provide input 
to a video processor 220. The video processor 220 processes the images from one or more 
cameras 210, and stores target characteristics in a memory 250, under the control of a system 
controller 240. In a preferred embodiment, the system controller 240 also facilitates control of 
the fields of view of the cameras 210, and select functions of the video processor 220. As noted 
above, the tracking system 200 may control the cameras 210 automatically, based on tracking 
information that is provided by the video processor 220. 

This invention primarily addresses the color matching task 160, and the corresponding 
target modeling task 130, and target identification task 170 used to effect the color matching 
process of this invention. The color matching process is based on the observation that some 
visual characteristics are more or less sensitive to environmental changes, such as lighting, 
shadows, reflections, and so on. For ease of reference, uncontrolled changes in conditions that 
affect visual characteristics are herein termed 'noise'. 

It has been found that the noise experienced in a typical environment generally relates to 
changes in the brightness of objects, as the environmental conditions change, or as an object 
travels from one set of environmental conditions to another. In a preferred embodiment of this 
invention, a representation that provides a separation of brightness from chromaticity is used, to 
provide a representation that is robust to changes in brightness while still retaining color 
information. Experiments have shown that the HSI (Hue, Saturation, Intensity) color model 
provides a better separation between brightness and chromaticity than the RGB (Red, Green, 
Blue) color model that is typically used in video imaging. Hue represents dominant color as 
perceived by an observer; saturation represents the relative purity, or the amount of white mixed 
with the color; and intensity is a subjective measure that refers to the amount of light provided 
by the color. Other models, such as YUV, or a model specifically created to distinguish 
brightness and chromaticity, may also be used. 

FIG. 3 illustrates an example flow diagram for creating a composite histogram of color 
hue and saturation, and gray intensity characteristics in accordance with this invention, as may 
be used in block 160, and corresponding block 130, in FIG. 1. It is assumed herein that the input 
image comprises RGB color components, although the source may provide YUV components, or 
others, and it is assumed that an HSI color model is being used for characterizing the image. The 
RGB image is converted to an HSI image, at 310. The equations for effecting this conversion are 
provided below; equations for converting to and from other color model formats are generally 
known to those skilled in the art. 

\[ I = \tfrac{1}{3}\,(R + G + B) \]

\[ S = 1 - \frac{\min(R, G, B)}{I} \]

\[ H = \cos^{-1}\left\{ \frac{3\,(R - I)}{2\sqrt{(R-G)^{2} + (R-B)(G-B)}} \right\} \]

The intensity component, I, can be seen to correspond to an average magnitude of the 
color components, and is substantially insensitive to changes in color and highly sensitive to 
changes in brightness. The hue component, H, can be seen to correspond to relative differences 
between the red, green, and blue components, and thus is sensitive to changes in color, and fairly 
insensitive to changes in brightness. The saturation component, S, is based on a ratio of the 
minimum color component to the average magnitude of the color components, and thus is also 
fairly insensitive to changes in brightness, but, being based on the minimum color component, is 
also somewhat less sensitive to changes in color than the hue component. 
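
As a worked illustration of these sensitivities, consider an idealized brightness change that scales all three color components by a common factor k > 0. This uniform scaling is an assumption made only for this example, since real lighting changes are seldom so uniform, and the saturation is taken in the form S = 1 - min(R, G, B)/I, consistent with the description of S given here. Under that assumption:

\[
\begin{aligned}
I' &= \tfrac{1}{3}(kR + kG + kB) = k\,I, \\
S' &= 1 - \frac{\min(kR, kG, kB)}{k\,I} = S, \\
\cos H' &= \frac{3\,(kR - kI)}{2\sqrt{k^{2}\left[(R-G)^{2} + (R-B)(G-B)\right]}} = \cos H .
\end{aligned}
\]

In this idealized case the intensity follows the brightness change exactly, while hue and saturation are unchanged.
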

Note, however, that the hue component, being based on a relative difference between 
color components, is undefined (nominally 0) for the color gray, which is produced when the 
red, green, and blue components are equal to each other. The hue component is also highly 
variable for colors close to gray. For example, a 'near' gray having an RGB value of (101, 100, 
100) has an HSI value of (0, 0.0033, 100.333) whereas an RGB value of (100, 101, 100) produces 
an HSI value of (2.09, 0.0033, 100.333), even though these two RGB values are virtually 
indistinguishable (as evidenced by the constant values of saturation and intensity). Similar 
anomalies in hue and saturation components occur for low-intensity color measurements as well. 
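
The near-gray example can be reproduced with a minimal Python sketch of the conversion. It assumes the intensity and hue equations of this description, takes saturation as one minus the ratio of the minimum component to the intensity (as the description of S suggests), returns hue in radians, and assigns pure gray the nominal hue of 0. The function name `rgb_to_hsi` is hypothetical.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel to (hue in radians, saturation, intensity)."""
    i = (r + g + b) / 3.0
    # Assumed form of saturation: 1 - min/I (black is given saturation 0).
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i
    den = 2.0 * math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        # R = G = B: hue is undefined, nominally 0.
        return 0.0, s, i
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_h = max(-1.0, min(1.0, 3.0 * (r - i) / den))
    return math.acos(cos_h), s, i

# Two virtually indistinguishable near-gray pixels give very different hues.
print(rgb_to_hsi(101, 100, 100))
print(rgb_to_hsi(100, 101, 100))
```

The two calls differ by a single count in one channel, yet the hue moves from about 0 to about 2.09 radians while saturation and intensity stay the same.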

Experiments have confirmed that both the hue and saturation components are effective 
for distinguishing color, and that the hue component is more robust than the saturation 
component for distinguishing true color, but highly sensitive to noise for gray or near gray 
colors, or colors with an overall low intensity level. For ease of reference, colors with very low 
intensity levels are herein defined as non-colors, because the color of a very low intensity pixel 
is substantially indistinguishable from black (or dark gray), and/or because determining the true 
color components of a low intensity input signal to a camera has a high noise factor. 

In accordance with this invention, separate histograms are used to characterize color (i.e. 
non-gray) pixels and non-color (i.e. gray, or near-gray, or low-intensity) pixels. A composite of 
these two histograms is used for target characterization and subsequent color matching within an 
image to track the motion of the characterized target. As illustrated in FIG. 3, at 320, gray, or 
near-gray, pixels (R≈G≈B) are identified, preferably by defining all colors that lie within a toroid 
of the R=G=B line in the RGB color space to be near-gray. The radius of the toroid defines the 
boundary for defining each pixel as either non-gray (color) or gray (non-color), and is preferably 
determined heuristically. Generally a radius of less than ten percent of the maximum range of the 
color values is sufficient to filter gray pixels from color pixels. 
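
One way to sketch the gray/color classification in Python is to measure each pixel's perpendicular distance from the R=G=B line and compare it to a radius. The function names, the 8 percent radius, and the low-intensity cutoff of 20 are hypothetical values chosen for illustration; the text only says that the radius is found heuristically and is generally under ten percent of the range.

```python
import math

def distance_from_gray_line(r, g, b):
    """Perpendicular distance of the point (r, g, b) from the R=G=B line."""
    return math.sqrt(((r - g) ** 2 + (g - b) ** 2 + (b - r) ** 2) / 3.0)

def is_color_pixel(r, g, b, max_value=255, radius_fraction=0.08,
                   min_intensity=20.0):
    """Return True for color pixels, False for non-color pixels.

    radius_fraction and min_intensity are assumed, heuristic thresholds.
    """
    intensity = (r + g + b) / 3.0
    if intensity < min_intensity:
        # Very low intensity pixels are treated as non-color.
        return False
    return distance_from_gray_line(r, g, b) > radius_fraction * max_value

print(is_color_pixel(101, 100, 100))  # near-gray
print(is_color_pixel(255, 0, 0))      # saturated red
```

The distance expression is the exact point-to-line distance for the line through the origin along (1, 1, 1), written in terms of pairwise channel differences to avoid subtracting two large, nearly equal numbers.
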

A histogram is created for the color pixels, at 330, for recording the occurrence of each 
hue-saturation pair. Because hue has been found to be a more sensitive discriminator of color, 
the resolution of the histogram along the hue axis is finer than the resolution along the saturation 
axis. In a preferred embodiment, the hue axis is divided into 32 hue values and the saturation 
axis is divided into 4 saturation values, for a total of 128 histogram 'bins' for containing the 
distribution of hue-saturation pairs contained within the target. At 340, a histogram of intensity 
levels of the gray pixels is created; nominally, as few as 16 different levels of intensity are 
sufficient to distinguish among gray objects, in combination with the color histogram 
information. These two histograms form a composite histogram that is used to characterize the 
target. The composite histogram contains a total number of 'bins' that is equal to the sum of the 
number of different hue-saturation pairs and intensity levels. 
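
The composite histogram can be sketched in Python as a single list of counts, with the hue-saturation bins first and the gray-intensity bins after them. This is an illustrative layout, not one prescribed by the text: the function names, the bin ordering, the hue range of 0 to pi (the range of the inverse cosine), and the 0 to 255 intensity range are all assumptions. Each input pixel is given as an already converted (hue, saturation, intensity, is_color) tuple.

```python
import math

HUE_BINS, SAT_BINS, GRAY_BINS = 32, 4, 16  # 128 color bins + 16 gray bins

def composite_bin(h, s, i, is_color, hue_max=math.pi, i_max=255.0):
    """Map one pixel to its bin index in the composite histogram."""
    if is_color:
        hue_bin = min(int(h / hue_max * HUE_BINS), HUE_BINS - 1)
        sat_bin = min(int(s * SAT_BINS), SAT_BINS - 1)
        return hue_bin * SAT_BINS + sat_bin
    # Non-color pixels are binned by intensity only, after the color bins.
    gray_bin = min(int(i / i_max * GRAY_BINS), GRAY_BINS - 1)
    return HUE_BINS * SAT_BINS + gray_bin

def composite_histogram(pixels):
    """Count pixels per composite bin; pixels are (h, s, i, is_color)."""
    hist = [0] * (HUE_BINS * SAT_BINS + GRAY_BINS)
    for h, s, i, is_color in pixels:
        hist[composite_bin(h, s, i, is_color)] += 1
    return hist

sample = [(0.1, 0.9, 120, True), (0.1, 0.9, 130, True), (0.0, 0.0, 100, False)]
hist = composite_histogram(sample)
print(len(hist), sum(hist))
```

With these parameters the list has 144 entries, matching the sum of 128 hue-saturation pairs and 16 intensity levels.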

By maintaining a histogram of color information after filtering out gray pixels, in 
accordance with this invention, efficient and effective color discrimination can be achieved, 
without the variance typically associated with color discrimination among gray, or near-gray, 
pixels or objects. By maintaining a histogram of intensity information for gray pixels only, 
efficient and effective discrimination can be achieved, without the variance typically associated 
with the intensity measure of color pixels under different lighting conditions. 

In a preferred embodiment, the composite histogram of the target is compared to 
similarly determined histograms corresponding to regions of the image of substantially the same 
size and shape as the target. Preferably, to simplify the comparison process, targets are identified 
as rectangular objects, or similarly easy to define region shapes. Any of a variety of histogram 
comparison techniques can be used to determine the region in the image that most closely 
corresponds to the target, corresponding to block 170 in FIG. 1. The selected histogram 
comparison technique determines the characteristics of the target that are stored in the target 
characteristics memory 250 of FIG. 2 by the target modeling block 130 of FIG. 1. In a preferred 
embodiment of this invention, the composite histogram, containing both color (hue-saturation) 
and non-color (intensity) frequency counts, is used, although the color and non-color histograms 
may be processed independently to determine a corresponding region in each image that is 
processed. If the histograms are processed independently, different histogram comparison 
techniques may be applied to the color histogram and the non-color histogram. 

In a preferred embodiment of this invention, a fast histogram technique as described in 
copending application "PALETTE-BASED HISTOGRAM MATCHING", U.S. patent 
application number , filed for Miroslav Trajkovic, Attorney Docket 
US010239, and incorporated by reference herein, is used for finding a similar distribution of 
target color and non-color pixels in an image. A histogram vector, containing the N most popular 
values in the target (of either hue-saturation or intensity) is used to characterize the target, in lieu 
of the entirety of possible color and non-color values forming the histogram. The 
target-modeling block 130 of FIG. 1 stores this N-element vector, and an identification of the color or 
intensity corresponding to each element of the vector, as the target characteristics, in memory 
250 of FIG. 2. That is, using the example parameters presented above, the target histogram has a 
total of 128 possible hue-saturation pairs (32 hue levels × 4 saturation levels). Assume in this 
example that eight intensity levels are used to characterize the non-color pixels, thereby 
providing a total of 136 possible histogram classes, or 'bins', for counting the number of 
occurrences of chromatic (hue-saturation) values or gray scale (intensity) levels in the target. For 
ease of reference, the term composite value is used hereinafter to refer to either a hue-saturation 
pair or an intensity level, depending upon whether the pixel is classified as color or non-color. In 
a preferred embodiment, the sixteen most frequently occurring composite values in the target 
form a 16-element vector. An identification of each of these composite values, and the number 
of occurrences of each composite value in the target, is stored as the target characteristics in 
memory 250. The set of composite values forming the target histogram vector is termed the 
target palette, each of the N most frequently occurring composite values being termed a palette 
value. 
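
Selecting the target palette can be sketched in a few lines of Python. The sketch assumes the composite histogram is a list of counts indexed by composite value; the function name `target_palette` and the choice to skip empty bins and to break ties by bin index are illustrative assumptions.

```python
def target_palette(histogram, n=16):
    """Return (palette, counts) for the n most frequent composite values.

    palette holds the bin indices (the palette values); counts holds the
    number of occurrences of each one in the target. Empty bins are skipped.
    """
    ranked = sorted(range(len(histogram)),
                    key=lambda k: histogram[k], reverse=True)
    palette = [k for k in ranked[:n] if histogram[k] > 0]
    counts = [histogram[k] for k in palette]
    return palette, counts

# Hypothetical 136-bin target histogram with three occupied bins.
hist = [0] * 136
hist[5], hist[130], hist[7] = 40, 25, 10
palette, counts = target_palette(hist)
print(palette, counts)
```

Only the palette and its counts need to be stored as the target characteristics, rather than all 136 bins.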

To effect the color comparison in block 170 of FIG. 1, the image is processed to identify 
the occurrences of the target palette values in the image. All other composite values are ignored. 
A palette image is formed that contains the identification of the corresponding target palette 
value for each pixel in the image. Pixels that contain composite values that are not contained in 
the target palette are assigned a zero, or null, value. A count of each of the non-zero entries in a 
target-sized region of the image forms the histogram vector corresponding to the region. Thus, 
by ignoring all image pixel values that are not contained in the target palette, the time required to 
create a histogram vector for each target-sized region in the image is substantially reduced. The 
referenced co-pending application also discloses a recursive technique for further improving the 
speed of the histogram creation process. The similarity measure of each region to the target is 
determined as: 

\[ S = \sum_{k=1}^{n} \min\left(h^{R}_{k},\; h^{T}_{k}\right), \]

where \(h^{R}\) is the histogram vector of the region, \(h^{T}\) is the histogram vector of the target, and n is 
the length, or number of dimensions, in each histogram vector. The region with the highest 
similarity measure, above some minimum normalized threshold, is defined as the region that 
contains the target, based on the above described color and non-color matching. 
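
The region search can be sketched in Python as a brute-force scan over target-sized windows of a palette image. This is a simplified illustration, not the recursive technique of the co-pending application. The palette image is assumed to hold 0 for pixels outside the palette and 1..n for the n palette values; the function names, the normalization by the target's total count, and the 0.5 threshold are assumptions.

```python
def similarity(region_hist, target_hist):
    """Histogram intersection: sum of element-wise minima."""
    return sum(min(r, t) for r, t in zip(region_hist, target_hist))

def region_histogram(palette_image, top, left, height, width, n):
    """Count palette values 1..n inside one rectangular region."""
    hist = [0] * n
    for row in palette_image[top:top + height]:
        for value in row[left:left + width]:
            if value:  # 0 marks a pixel whose value is not in the palette
                hist[value - 1] += 1
    return hist

def best_region(palette_image, target_hist, height, width, min_score=0.5):
    """Return (score, top, left) of the best region, or None if none passes."""
    total = sum(target_hist)
    if total == 0:
        return None
    best = None
    for top in range(len(palette_image) - height + 1):
        for left in range(len(palette_image[0]) - width + 1):
            hist = region_histogram(palette_image, top, left,
                                    height, width, len(target_hist))
            score = similarity(hist, target_hist) / total
            if score >= min_score and (best is None or score > best[0]):
                best = (score, top, left)
    return best

image = [[0, 0, 0, 0],
         [0, 1, 2, 0],
         [0, 1, 1, 0]]
print(best_region(image, target_hist=[3, 1], height=2, width=2))
```

Because the intersection never exceeds the total count of the target histogram, dividing by that total gives a score between 0 and 1 that can be compared against a fixed threshold.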

The foregoing merely illustrates the principles of the invention. It will thus be 
appreciated that those skilled in the art will be able to devise various arrangements which, 
although not explicitly described or shown herein, embody the principles of the invention and 
are thus within the spirit and scope of the following claims. 